Classifying Domain-Specific Terms Using a Dictionary
نویسندگان
چکیده
Automatically building domain-specific ontologies is a highly challenging task as it requires extracting domain-specific terms from a corpus and assigning them relevant domain concept labels. In this paper, we focus on the second task: i.e., assigning domain concepts to domain-specific terms. Motivated by previous approaches in related research (such as word sense disambiguation (WSD) and named entity recognition (NER)) that use semantic similarity among domain concepts, we explore three types of features — contextual, domain concepts, topics — to measure the semantic similarity of terms; we then assign the domain concepts from the best matching terms. As evaluation, we collected domainspecific terms from FOLDOC, a freely available on-line dictionary for the the Computing domain, and defined 9 domain concepts for this domain. Our results show that beyond contextual features, using domain concepts and topics derived from domain-specific terms helps to improve assigning domain concepts to the terms.
منابع مشابه
A Novel Image Denoising Method Based on Incoherent Dictionary Learning and Domain Adaptation Technique
In this paper, a new method for image denoising based on incoherent dictionary learning and domain transfer technique is proposed. The idea of using sparse representation concept is one of the most interesting areas for researchers. The goal of sparse coding is to approximately model the input data as a weighted linear combination of a small number of basis vectors. Two characteristics should b...
متن کاملAutomatic Construction of Semantic Dictionary for Question Categorization
An automatic method for building a semantic dictionary from existing questions in a pattern-based question answering system is proposed for question categorization. This dictionary consists of two main parts: Semantic Domain Terms (SDT), which is a domain specific term list, and Semantic Labeled Terms (SLT), which contain common terms tagged with semantic labels. The semantic dictionary is buil...
متن کاملHarvesting Domain-Specific Terms using Wikipedia
We present a simple but effective method of automatically extracting domain-specific terms using Wikipedia as training data (i.e. self-supervised learning). Our first goal is to show, using human judgments, that Wikipedia categories are domainspecific and thus can replace manually annotated terms. Second, we show that identifying such terms using harvested Wikipedia categories and entities as s...
متن کاملEnriching Slovene WordNet with domain-specific terms
The paper describes an innovative approach to expanding the domain coverage of wordnet by exploiting multiple resources. In the experiment described here we are using a large monolingual Slovene corpus of texts from the domain of informatics to harvest terminology from, and a parallel English-Slovene corpus and an online dictionary as bilingual resources to facilitate the mapping of terms to th...
متن کاملAutomatic Methods for the Extension of a Bilingual Dictionary using Comparable Corpora
Bilingual dictionaries define word equivalents from one language to another, thus acting as an important bridge between languages. No bilingual dictionary is complete since languages are in a constant state of change. Additionally, dictionaries are unlikely to achieve complete coverage of all language terms. This paper investigates methods for extending dictionaries using non-aligned corpora, b...
متن کامل